Search | BVS Regional Portal

1.

Identifying and overcoming COVID-19 vaccination impediments using Bayesian data mining techniques.

Lei, Bowen; Mahajan, Arvind; Mallick, Bani.

Sci Rep ; 14(1): 8595, 2024 Apr 13.

Article in English | MEDLINE | ID: mdl-38615084

ABSTRACT

The COVID-19 pandemic has profoundly reshaped human life. The development of COVID-19 vaccines has offered a semblance of normalcy. However, obstacles to vaccination have led to substantial loss of life and economic burdens. In this study, we analyze data from a prominent health insurance provider in the United States to uncover the underlying reasons behind the inability, refusal, or hesitancy to receive vaccinations. Our research proposes a methodology for pinpointing affected population groups and suggests strategies to mitigate vaccination barriers and hesitations. Furthermore, we estimate potential cost savings resulting from the implementation of these strategies. To achieve our objectives, we employed Bayesian data mining methods to reduce data dimensions and identify significant variables (features) influencing vaccination decisions. Comparative analysis shows that the Bayesian method outperforms cutting-edge alternatives.

Subjects

COVID-19 , Humans , Bayes Theorem , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines , Pandemics , Data Mining , Vaccination

2.

Natural language processing (NLP) to facilitate abstract review in medical research: the application of BioBERT to exploring the 20-year use of NLP in medical research.

Masoumi, Safoora; Amirkhani, Hossein; Sadeghian, Najmeh; Shahraz, Saeid.

Syst Rev ; 13(1): 107, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38622611

ABSTRACT

BACKGROUND: Abstract review is a time- and labor-consuming step in systematic and scoping literature reviews in medicine. Text mining methods, typically natural language processing (NLP), may efficiently replace manual abstract screening. This study applies NLP to a deliberately selected literature review problem, the trend of using NLP in medical research, to demonstrate the performance of this automated abstract review model. METHODS: Scanning the PubMed, Embase, PsycINFO, and CINAHL databases, we identified 22,294 records, with a final selection of 12,817 English abstracts published between 2000 and 2021. We devised a manual classification of medical fields based on three variables: the context of use (COU), text source (TS), and primary research field (PRF). A training dataset was developed after reviewing 485 abstracts. We used a language model called Bidirectional Encoder Representations from Transformers (BERT) to classify the abstracts. To evaluate the performance of the trained models, we report the micro F1-score and accuracy. RESULTS: The trained models' micro F1-scores for classifying abstracts into the three variables were 77.35% for COU, 76.24% for TS, and 85.64% for PRF. The average annual growth rate (AAGR) of the publications was 20.99% between 2000 and 2020 (a yearly increase of 72.01 articles; 95% CI: 56.80-78.30), with 81.76% of the abstracts published between 2010 and 2020. Studies on neoplasms constituted 27.66% of the entire corpus with an AAGR of 42.41%, followed by studies on mental conditions (AAGR = 39.28%). While electronic health or medical records comprised the highest proportion of text sources (57.12%), omics databases had the highest growth among all text sources with an AAGR of 65.08%. The most common NLP application was clinical decision support (25.45%). CONCLUSIONS: BioBERT showed acceptable performance in abstract review. If future research confirms high performance for this language model, it could reliably replace manual abstract review.

Subjects

Biomedical Research , Natural Language Processing , Humans , Language , Data Mining , Electronic Health Records
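
Note on entry 2: the abstract reports a micro F1-score alongside accuracy. The following is a minimal sketch of how a micro-averaged F1 is computed by pooling true positives, false positives, and false negatives over all classes. It assumes single-label classification (each abstract gets exactly one class), which the abstract does not state; under that assumption micro F1 and accuracy coincide. The labels in the example are hypothetical and are not taken from the study.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical text-source labels, for illustration only.
y_true = ["EHR", "EHR", "omics", "notes", "EHR"]
y_pred = ["EHR", "omics", "omics", "notes", "notes"]
print(micro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

In a multi-label setting the two metrics can differ, so reporting both would carry more information there.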

3.

Text mining of hypertension researches in the west Asia region: a 12-year trend analysis.

Rezapour, Mohammad; Yazdinejad, Mohsen; Rajabi Kouchi, Faezeh; Habibi Baghi, Masoomeh; Khorrami, Zahra; Khavanin Zadeh, Morteza; Pourbaghi, Elmira; Rezapour, Hassan.

Ren Fail ; 46(1): 2337285, 2024 Dec.

Article in English | MEDLINE | ID: mdl-38616180

ABSTRACT

More than half of the world's population lives in Asia, and hypertension (HTN) is the most prevalent risk factor found in Asia. Numerous articles have been published about HTN in the Eastern Mediterranean Region (EMRO), and artificial intelligence (AI) methods can analyze these articles and extract the top trends in each country. The present analysis uses latent Dirichlet allocation (LDA), a topic modeling (TM) algorithm in text mining, to obtain subjective topic-word distributions from 2790 studies across the EMRO. The studies examined cover the last 12 years, and the LDA results show that HTN research published in the EMRO discusses changes in blood pressure (BP) and the factors affecting it. Among the countries in the region, most of these articles relate to I.R. Iran and Egypt, with publications increasing from 2017 to 2018 and reaching their highest level in 2021. Meanwhile, Iraq and Lebanon have been conducting research since 2010. The EMRO word cloud highlights 'BMI', 'mortality', 'age', and 'meal', which represent important indicators, dangerous outcomes of high BP, and gender of HTN patients in EMRO, respectively.

Subjects

Artificial Intelligence , Hypertension , Humans , Data Mining , Algorithms , Asia/epidemiology , Hypertension/epidemiology

4.

[Element relationship and extension path of clinical evidence knowledge map with Chinese patent medicine].

Ji, Zhao-Chen; Hu, Hai-Yin; Peng, De-Hui; Wang, Dan-Lei; Wu, Xiao-Lei; Feng, Chao-Nan; Zhang, Jun-Hua.

Zhongguo Zhong Yao Za Zhi ; 49(3): 836-841, 2024 Feb.

Article in Chinese | MEDLINE | ID: mdl-38621887

ABSTRACT

This study aims to construct the element relationship and extension path of a clinical evidence knowledge map for Chinese patent medicine, providing basic technical support for the formation and transformation of the evidence chain of Chinese patent medicine and providing collection, induction, and summary schemes for massive and disorganized clinical data. Based on the elements of evidence-based PICOS, the conventional construction methods of knowledge graphs were collected and summarized. Firstly, the data entities related to Chinese patent medicine were classified, and entity linking (disambiguation) was performed. Secondly, the study associated and classified the attribute information of the data entities. Finally, the logical relationship between entities was constructed, and then the element relationship and extension path of the knowledge map conforming to the characteristics of clinical evidence of Chinese patent medicine were summarized. The construction of the clinical evidence knowledge map of Chinese patent medicine was mainly based on process design and logical structure, and the element relationship of the knowledge map was expressed according to the PICOS principle and evidence level. The extension path crossed three levels (model layer, data layer application, and new evidence application), and the study gradually explored the path from disease, core evaluation indicators, Chinese patent medicine, core prescriptions, syndrome and treatment rules, and medical case comparison (evolution law) to new drug research and development. In this study, the top-level design of the construction of the clinical evidence knowledge map of Chinese patent medicine has been clarified, but it still needs the joint efforts of multiple disciplines. With the continuous improvement of map construction technology in line with the characteristics of traditional Chinese medicine (TCM), the study can provide necessary basic technical support and reference for the development of the TCM discipline.

Subjects

Chinese Herbal Drugs , Chinese Herbal Drugs/therapeutic use , Traditional Chinese Medicine , Nonprescription Drugs/therapeutic use , Technology , Data Mining/methods

5.

Identifying cancer patients who received palliative care using the SPICT-LIS in medical records: a rule-based algorithm and text-mining technique.

Limsomwong, Pawita; Ingviya, Thammasin; Fumaneeshoat, Orapan.

BMC Palliat Care ; 23(1): 83, 2024 Apr 01.

Article in English | MEDLINE | ID: mdl-38556869

ABSTRACT

BACKGROUND: Due to limited numbers of palliative care specialists and/or resources, access to palliative care remains limited in many low- and middle-income countries. Data science methods, such as rule-based algorithms and text mining, have the potential to improve palliative care by facilitating analysis of electronic healthcare records. This study aimed to develop and evaluate a rule-based algorithm for identifying cancer patients who may benefit from palliative care based on the Thai version of the Supportive and Palliative Care Indicators for a Low-Income Setting (SPICT-LIS) criteria. METHODS: The medical records of 14,363 cancer patients aged 18 years and older, diagnosed between 2016 and 2020 at Songklanagarind Hospital, were analyzed. Two rule-based algorithms, strict and relaxed, were designed to identify key SPICT-LIS indicators in the electronic medical records using tokenization and sentiment analysis. The inter-rater reliability between these two algorithms and palliative care physicians was assessed using percentage agreement and Cohen's kappa coefficient. Additionally, factors associated with patients who might benefit from palliative care were examined. RESULTS: The strict rule-based algorithm demonstrated a high degree of accuracy, with 95% agreement and a Cohen's kappa coefficient of 0.83. In contrast, the relaxed rule-based algorithm demonstrated lower agreement (71% agreement and a Cohen's kappa of 0.16). Advanced-stage cancer with symptoms such as pain, dyspnea, edema, delirium, xerostomia, and anorexia was identified as a significant predictor of potentially benefiting from palliative care. CONCLUSION: The integration of rule-based algorithms with electronic medical records offers a promising method for enhancing the timely and accurate identification of patients with cancer who might benefit from palliative care.

Subjects

Neoplasms , Palliative Care , Humans , Reproducibility of Results , Electronic Health Records , Neoplasms/therapy , Data Mining , Algorithms
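
Note on entry 5: the abstract assesses agreement between each algorithm and the physicians using percentage agreement and Cohen's kappa. As a rough illustration of why the two can diverge, the sketch below computes both for two raters. Kappa subtracts the agreement expected by chance from the observed agreement. The ratings are made up for the example and do not reproduce the study's data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Observed agreement: share of items on which both raters give the same label."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), with p_e the chance agreement."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    p_e = sum(counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical flags (1 = may benefit from palliative care), illustration only.
algorithm = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
physician = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
print(percent_agreement(algorithm, physician))  # 0.8
print(round(cohens_kappa(algorithm, physician), 3))  # 0.583
```

When one label dominates, chance agreement is high, so a fairly large percentage agreement can still correspond to a small kappa.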

6.

Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson's disease behavioral analysis.

Raza, Imran; Jamal, Muhammad Hasan; Qureshi, Rizwan; Shahid, Abdul Karim; Vistorte, Angel Olider Rojas; Samad, Md Abdus; Ashraf, Imran.

Sci Rep ; 14(1): 7635, 2024 Apr 01.

Article in English | MEDLINE | ID: mdl-38561391

ABSTRACT

Extracting knowledge from hybrid data, comprising both categorical and numerical data, poses significant challenges due to the inherent difficulty in preserving information and practical meanings during the conversion process. To address this challenge, hybrid data processing methods, combining complementary rough sets, have emerged as a promising approach for handling uncertainty. However, selecting an appropriate model and effectively utilizing it in data mining requires a thorough qualitative and quantitative comparison of existing hybrid data processing models. This research aims to contribute to the analysis of hybrid data processing models based on neighborhood rough sets by investigating the inherent relationships among these models. We propose a generic neighborhood rough set-based hybrid model specifically designed for processing hybrid data, thereby enhancing the efficacy of the data mining process without resorting to discretization and avoiding information loss or practical meaning degradation in datasets. The proposed scheme dynamically adapts the threshold value for the neighborhood approximation space according to the characteristics of the given datasets, ensuring optimal performance without sacrificing accuracy. To evaluate the effectiveness of the proposed scheme, we develop a testbed tailored for Parkinson's patients, a domain where hybrid data processing is particularly relevant. The experimental results demonstrate that the proposed scheme consistently outperforms existing schemes in adaptively handling both numerical and categorical data, achieving an impressive accuracy of 95% on the Parkinson's dataset. Overall, this research contributes to advancing hybrid data processing techniques by providing a robust and adaptive solution that addresses the challenges associated with handling hybrid data, particularly in the context of Parkinson's disease analysis.

Subjects

Algorithms , Parkinson Disease , Humans , Data Mining/methods , Uncertainty
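
Note on entry 6: the abstract describes neighborhood rough sets applied to hybrid (numerical plus categorical) data without discretization. The sketch below shows the textbook construction only: a neighborhood of radius delta around each sample, then lower and upper approximations of a target class. It is not the authors' adaptive scheme. The fixed delta, the distance function, the attribute layout, and the toy samples are all assumptions made for illustration, and numeric attributes are assumed to be scaled to [0, 1].

```python
def hybrid_distance(x, y, numeric_idx, categorical_idx):
    """Largest per-attribute difference: |x - y| for numeric attributes,
    0/1 mismatch for categorical attributes."""
    diffs = [abs(x[i] - y[i]) for i in numeric_idx]
    diffs += [0.0 if x[i] == y[i] else 1.0 for i in categorical_idx]
    return max(diffs) if diffs else 0.0


def neighborhoods(samples, delta, numeric_idx, categorical_idx):
    """For each sample, the set of sample indices within distance delta."""
    return [
        {j for j, y in enumerate(samples)
         if hybrid_distance(x, y, numeric_idx, categorical_idx) <= delta}
        for x in samples
    ]


def approximations(samples, labels, target, delta, numeric_idx, categorical_idx):
    """Lower approximation: neighborhood lies entirely inside the target class.
    Upper approximation: neighborhood overlaps the target class."""
    members = {i for i, lab in enumerate(labels) if lab == target}
    hoods = neighborhoods(samples, delta, numeric_idx, categorical_idx)
    lower = {i for i, h in enumerate(hoods) if h <= members}
    upper = {i for i, h in enumerate(hoods) if h & members}
    return lower, upper


# Hypothetical samples: (scaled numeric feature, categorical feature).
samples = [(0.10, "x"), (0.15, "x"), (0.20, "y"), (0.80, "y"), (0.85, "y")]
labels = ["A", "A", "A", "B", "A"]
lower, upper = approximations(samples, labels, "A", 0.1, [0], [1])
print(sorted(lower), sorted(upper))  # [0, 1, 2] [0, 1, 2, 3, 4]
```

Samples 3 and 4 share a neighborhood but carry different labels, so they fall in the boundary region (upper minus lower). An adaptive method would tune delta per dataset rather than fix it as done here.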

7.

Assessment of transparency indicators in space medicine.

Bellomo, Rosa Katia; Zavalis, Emmanuel A; Ioannidis, John P A.

PLoS One ; 19(4): e0300701, 2024.

Article in English | MEDLINE | ID: mdl-38564591

ABSTRACT

Space medicine is a vital discipline with often time-intensive and costly projects and constrained opportunities for studying various elements such as space missions, astronauts, and simulated environments. Moreover, private interests gain increasing influence in this discipline. In scientific disciplines with these features, transparent and rigorous methods are essential. Here, we undertook an evaluation of transparency indicators in publications within the field of space medicine. A meta-epidemiological assessment of PubMed Central Open Access (PMC OA) eligible articles within the field of space medicine was performed for prevalence of code sharing, data sharing, pre-registration, conflicts of interest, and funding. Text mining was performed with the rtransparent text mining algorithms with manual validation of 200 random articles to obtain corrected estimates. Across 1215 included articles, 39 (3%) shared code, 258 (21%) shared data, 10 (1%) were registered, 110 (90%) contained a conflict-of-interest statement, and 1141 (93%) included a funding statement. After manual validation, the corrected estimates for code sharing, data sharing, and registration were 5%, 27%, and 1%, respectively. Data sharing was 32% when limited to original articles and highest in space/parabolic flights (46%). Overall, across space medicine we observed modest rates of data sharing, rare sharing of code and almost non-existent protocol registration. Enhancing transparency in space medicine research is imperative for safeguarding its scientific rigor and reproducibility.

Subjects

Aerospace Medicine , Reproducibility of Results , Information Dissemination , PubMed , Data Mining

8.

Advancing the allergenicity assessment of new proteins using a text mining resource.

Novoa, Jorge; Fernandez-Dumont, Antonio; Mills, E N Clare; Moreno, F Javier; Pazos, Florencio.

Food Chem Toxicol ; 187: 114638, 2024 May.

Article in English | MEDLINE | ID: mdl-38582341

ABSTRACT

With a society increasingly demanding alternative protein food sources, new strategies for evaluating protein safety issues, such as allergenic potential, are needed. Large-scale and systemic studies on allergenic proteins are hindered by the limited and non-harmonized clinical information available for these substances in dedicated databases. A key piece of missing information is the symptomatology of the allergens, especially expressed in terms of standard vocabularies, which would allow connecting with other biomedical resources to carry out different studies related to human health. In this work, we have generated the first resource with a comprehensive annotation of allergens' symptomatology, using a text-mining approach that extracts significant co-mentions between these entities from the scientific literature (PubMed, ~36 million abstracts). The method identifies statistically significant co-mentions between the textual descriptions of the two types of entities in the literature as an indication of a relationship. A total of 1,180 clinical signs extracted from the Human Phenotype Ontology and the Medical Subject Heading terms of PubMed, together with other allergen-specific symptoms, were linked to 1,036 unique allergens annotated in two main allergen-related public databases via 14,009 relationships. This novel resource, publicly available through an interactive web interface, could serve as a starting point for future manually curated compilation of allergen symptomatology.

Subjects

Allergens , Data Mining , Humans , Data Mining/methods , Factual Databases , Proteins/metabolism
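
Note on entry 8: the abstract says the method keeps only statistically significant co-mentions but does not name the test. One common choice for this kind of question, assumed here purely for illustration, is a one-sided hypergeometric test: given how many abstracts mention the allergen and how many mention the symptom, how likely is it to see at least the observed number of abstracts mentioning both by chance? The function name and the counts below are hypothetical.

```python
from math import comb


def comention_p_value(n_total, n_allergen, n_symptom, n_both):
    """One-sided hypergeometric tail P(X >= n_both), where X is the number of
    abstracts mentioning both terms if mentions were independent."""
    upper = min(n_allergen, n_symptom)
    denom = comb(n_total, n_symptom)
    tail = sum(
        comb(n_allergen, k) * comb(n_total - n_allergen, n_symptom - k)
        for k in range(n_both, upper + 1)
    )
    return tail / denom


# Toy corpus: 20 abstracts, 5 mention the allergen, 4 mention the symptom,
# and 3 mention both.
p = comention_p_value(20, 5, 4, 3)
print(round(p, 4))  # 0.032
```

Exact binomial coefficients are fine for toy sizes like this. For a corpus on the scale of PubMed, a statistics library with a numerically stable hypergeometric survival function would be the practical choice, together with a correction for the many allergen-symptom pairs tested.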

9.

Public Discourse, User Reactions, and Conspiracy Theories on the X Platform About HIV Vaccines: Data Mining and Content Analysis.

Zhang, Jueman M; Wang, Yi; Mouton, Magali; Zhang, Jixuan; Shi, Molu.

J Med Internet Res ; 26: e53375, 2024 Apr 03.

Article in English | MEDLINE | ID: mdl-38568723

ABSTRACT

BACKGROUND: The initiation of clinical trials for messenger RNA (mRNA) HIV vaccines in early 2022 revived public discussion on HIV vaccines after 3 decades of unsuccessful research. These trials followed the success of mRNA technology in COVID-19 vaccines but unfolded amid intense vaccine debates during the COVID-19 pandemic. It is crucial to gain insights into public discourse and reactions about potential new vaccines, and social media platforms such as X (formerly known as Twitter) provide important channels. OBJECTIVE: Drawing from infodemiology and infoveillance research, this study investigated the patterns of public discourse and message-level drivers of user reactions on X regarding HIV vaccines by analyzing posts using machine learning algorithms. We examined how users used different post types to contribute to topics and valence and how these topics and valence influenced like and repost counts. In addition, the study identified salient aspects of HIV vaccines related to COVID-19 and prominent anti-HIV vaccine conspiracy theories through manual coding. METHODS: We collected 36,424 English-language original posts about HIV vaccines on the X platform from January 1, 2022, to December 31, 2022. We used topic modeling and sentiment analysis to uncover latent topics and valence, which were subsequently analyzed across post types in cross-tabulation analyses and integrated into linear regression models to predict user reactions, specifically likes and reposts. Furthermore, we manually coded the 1000 most engaged posts about HIV and COVID-19 to uncover salient aspects of HIV vaccines related to COVID-19 and the 1000 most engaged negative posts to identify prominent anti-HIV vaccine conspiracy theories. RESULTS: Topic modeling revealed 3 topics: HIV and COVID-19, mRNA HIV vaccine trials, and HIV vaccine and immunity. 
HIV and COVID-19 underscored the connections between HIV vaccines and COVID-19 vaccines, as evidenced by subtopics about their reciprocal impact on development and various comparisons. The overall valence of the posts was marginally positive. Compared to self-composed posts initiating new conversations, there was a higher proportion of HIV and COVID-19-related and negative posts among quote posts and replies, which contribute to existing conversations. The topic of mRNA HIV vaccine trials, most evident in self-composed posts, increased repost counts. Positive valence increased like and repost counts. Prominent anti-HIV vaccine conspiracy theories often falsely linked HIV vaccines to concurrent COVID-19 and other HIV-related events. CONCLUSIONS: The results highlight COVID-19 as a significant context for public discourse and reactions regarding HIV vaccines from both positive and negative perspectives. The success of mRNA COVID-19 vaccines shed a positive light on HIV vaccines. However, COVID-19 also situated HIV vaccines in a negative context, as observed in some anti-HIV vaccine conspiracy theories misleadingly connecting HIV vaccines with COVID-19. These findings have implications for public health communication strategies concerning HIV vaccines.

Subjects

AIDS Vaccines , COVID-19 , HIV Infections , Humans , COVID-19 Vaccines , Pandemics , Data Mining , COVID-19/epidemiology , COVID-19/prevention & control , Messenger RNA , HIV Infections/prevention & control

10.

Text mining and portal development for gene-specific publications on Alzheimer's disease and other neurodegenerative diseases.

Liu, Jiannan; Wu, Huanmei; Robertson, Daniel H; Zhang, Jie.

BMC Med Inform Decis Mak ; 24(Suppl 3): 98, 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38632621

ABSTRACT

BACKGROUND: Tremendous research efforts have been made in the Alzheimer's disease (AD) field to understand the disease etiology, progression and discover treatments for AD. Many mechanistic hypotheses, therapeutic targets and treatment strategies have been proposed in the last few decades. Reviewing previous work and staying current on this ever-growing body of AD publications is an essential yet difficult task for AD researchers. METHODS: In this study, we designed and implemented a natural language processing (NLP) pipeline to extract gene-specific neurodegenerative disease (ND) -focused information from the PubMed database. The collected publication information was filtered and cleaned to construct AD-related gene-specific publication profiles. Six categories of AD-related information are extracted from the processed publication data: publication trend by year, dementia type occurrence, brain region occurrence, mouse model information, keywords occurrence, and co-occurring genes. A user-friendly web portal is then developed using Django framework to provide gene query functions and data visualizations for the generalized and summarized publication information. RESULTS: By implementing the NLP pipeline, we extracted gene-specific ND-related publication information from the abstracts of the publications in the PubMed database. The results are summarized and visualized through an interactive web query portal. Multiple visualization windows display the ND publication trends, mouse models used, dementia types, involved brain regions, keywords to major AD-related biological processes, and co-occurring genes. Direct links to PubMed sites are provided for all recorded publications on the query result page of the web portal. 
CONCLUSION: The resulting portal is a valuable tool and data source for quickly querying and displaying AD publications tailored to users' research areas and gene targets of interest, which is especially convenient for users without informatics mining skills. Our study will not only keep AD researchers updated on the progress of AD research and help them conduct preliminary examinations efficiently, but also offer additional support for hypothesis generation and validation, which will contribute significantly to the communication, dissemination, and progress of AD research.

Subjects

Alzheimer Disease , Neurodegenerative Diseases , Animals , Mice , Data Mining/methods , PubMed , Factual Databases

11.

Rummagene: massive mining of gene sets from supporting materials of biomedical research publications.

Clarke, Daniel J B; Marino, Giacomo B; Deng, Eden Z; Xie, Zhuorui; Evangelista, John Erol; Ma'ayan, Avi.

Commun Biol ; 7(1): 482, 2024 Apr 20.

Article in English | MEDLINE | ID: mdl-38643247

ABSTRACT

Many biomedical research publications contain gene sets in their supporting tables, and these sets are currently not available for search and reuse. By crawling PubMed Central, the Rummagene server provides access to hundreds of thousands of such mammalian gene sets. So far, we scanned 5,448,589 articles to find 121,237 articles that contain 642,389 gene sets. These sets are served for enrichment analysis, free text, and table title search. Investigating statistical patterns within the Rummagene database, we demonstrate that Rummagene can be used for transcription factor and kinase enrichment analyses, and for gene function predictions. By combining gene set similarity with abstract similarity, Rummagene can find surprising relationships between biological processes, concepts, and named entities. Overall, Rummagene brings to surface the ability to search a massive collection of published biomedical datasets that are currently buried and inaccessible. The Rummagene web application is available at https://rummagene.com .

Subjects

Biomedical Research , Data Mining , Animals , Software , Factual Databases , Gene Expression Regulation , Mammals

12.

Analysis of heavy metal and polycyclic aromatic hydrocarbon pollution characteristics of a typical metal rolling industrial site based on data mining.

Li, De'an; Deng, Yirong; Liu, LiLi; Wang, Jun; Huang, Zaoquan; Zhang, Xiaolu.

Environ Geochem Health ; 46(5): 146, 2024 Apr 05.

Article in English | MEDLINE | ID: mdl-38578375

ABSTRACT

With the transformation and upgrading of industries, the environmental problems caused by residual contaminated industrial sites are becoming increasingly prominent. Based on actual investigation cases, this study analyzed the soil pollution status of a remaining site of the copper and zinc rolling industry and found that the pollutants exceeding the screening values included Cu, Ni, Zn, Pb, total petroleum hydrocarbons (TPH), and 6 polycyclic aromatic hydrocarbon (PAH) monomers. Based on traditional analysis methods such as the correlation coefficient and spatial distribution, combined with machine learning methods such as SOM + K-means, it is inferred that the heavy metals Zn and Pb may be mainly related to the production history of zinc rolling, while Cu and Ni may mainly originate from the production history of copper rolling. PAHs are mainly due to the incomplete combustion of fossil fuels in the melting equipment. TPH pollution is speculated to be related to oil leakage during the period of industrial use and the later period of vehicle parking. The results showed that traditional analysis methods can quickly identify the correlation between site pollutants, while SOM + K-means machine learning methods can further extract complex hidden relationships in the data and achieve in-depth mining of site monitoring data.

Subjects

Environmental Pollutants , Heavy Metals , Polycyclic Aromatic Hydrocarbons , Soil Pollutants , Copper/analysis , Polycyclic Aromatic Hydrocarbons/analysis , Lead/analysis , Soil Pollutants/analysis , Heavy Metals/analysis , Zinc/analysis , Environmental Pollution/analysis , Soil , Environmental Pollutants/analysis , Data Mining , Environmental Monitoring/methods , China , Risk Assessment

13.

GPDminer: a tool for extracting named entities and analyzing relations in biological literature.

Park, Yeon-Ji; Yang, Geun-Je; Sohn, Chae-Bong; Park, Soo Jun.

BMC Bioinformatics ; 25(1): 101, 2024 Mar 06.

Article in English | MEDLINE | ID: mdl-38448845

ABSTRACT

PURPOSE: The expansion of research across various disciplines has led to a substantial increase in published papers and journals, highlighting the necessity for reliable text mining platforms for database construction and knowledge acquisition. This abstract introduces GPDMiner (Gene, Protein, and Disease Miner), a platform designed for the biomedical domain, addressing the challenges posed by the growing volume of academic papers. METHODS: GPDMiner is a text mining platform that utilizes advanced information retrieval techniques. It operates by searching PubMed for specific queries, extracting and analyzing information relevant to the biomedical field. This system is designed to discern and illustrate relationships between biomedical entities obtained from automated information extraction. RESULTS: The implementation of GPDMiner demonstrates its efficacy in navigating the extensive corpus of biomedical literature. It efficiently retrieves, extracts, and analyzes information, highlighting significant connections between genes, proteins, and diseases. The platform also allows users to save their analytical outcomes in various formats, including Excel and images. CONCLUSION: GPDMiner offers a notable additional functionality among the array of text mining tools available for the biomedical field. This tool presents an effective solution for researchers to navigate and extract relevant information from the vast unstructured texts found in biomedical literature, thereby providing distinctive capabilities that set it apart from existing methodologies. Its application is expected to greatly benefit researchers in this domain, enhancing their capacity for knowledge discovery and data management.

Subjects

Data Management , Data Mining , Factual Databases , Knowledge Discovery , PubMed

14.

Unobtrusive Cognitive Assessment in Smart-Homes: Leveraging Visual Encoding and Synthetic Movement Traces Data Mining.

Zolfaghari, Samaneh; Kristoffersson, Annica; Folke, Mia; Lindén, Maria; Riboni, Daniele.

Sensors (Basel) ; 24(5), 2024 Feb 21.

Article in English | MEDLINE | ID: mdl-38474917

ABSTRACT

The ubiquity of sensors in smart-homes facilitates the support of independent living for older adults and enables cognitive assessment. Notably, there has been a growing interest in utilizing movement traces for identifying signs of cognitive impairment in recent years. In this study, we introduce an innovative approach to identify abnormal indoor movement patterns that may signal cognitive decline. This is achieved through the non-intrusive integration of smart-home sensors, including passive infrared sensors and sensors embedded in everyday objects. The methodology involves visualizing user locomotion traces and discerning interactions with objects on a floor plan representation of the smart-home, and employing different image descriptor features designed for image analysis tasks and synthetic minority oversampling techniques to enhance the methodology. This approach distinguishes itself by its flexibility in effortlessly incorporating additional features through sensor data. A comprehensive analysis, conducted with a substantial dataset obtained from a real smart-home, involving 99 seniors, including those with cognitive diseases, reveals the effectiveness of the proposed functional prototype of the system architecture. The results validate the system's efficacy in accurately discerning the cognitive status of seniors, achieving a macro-averaged F1-score of 72.22% for the two targeted categories: cognitively healthy and people with dementia. Furthermore, through experimental comparison, our system demonstrates superior performance compared with state-of-the-art methods.

Subjects

Cognition Disorders , Cognitive Dysfunction , Humans , Aged , Cognitive Dysfunction/diagnosis , Independent Living , Cognition , Data Mining
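
Note on entry 14: the abstract reports a macro-averaged F1-score over two categories (cognitively healthy and people with dementia). As a small illustration of what that metric rewards, the sketch below computes F1 per class and takes the unweighted mean, so the minority class counts as much as the majority class. The predictions are invented for the example and are unrelated to the study's dataset.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating that class as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical, imbalanced labels: 6 healthy participants and 2 with dementia.
y_true = ["healthy"] * 6 + ["dementia"] * 2
y_pred = ["healthy"] * 5 + ["dementia"] + ["dementia", "healthy"]
print(round(macro_f1(y_true, y_pred), 4))  # 0.6667
```

Here 6 of 8 predictions are correct (75% accuracy), yet the macro F1 is lower because the dementia class is recognized only half the time. That sensitivity to the smaller class is why macro averaging is often preferred for imbalanced cohorts.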

15.

Analysis of research topics and trends in investigator-initiated research/trials (IIRs/IITs): A topic modeling study.

Huang, Litao; Shi, Fanfan; Hu, Dan; Kang, Deying.

Medicine (Baltimore) ; 103(10): e37375, 2024 Mar 08.

Article in English | MEDLINE | ID: mdl-38457583

ABSTRACT

BACKGROUND: With the exponential growth of publications in the field of investigator-initiated research/trials (IIRs/IITs), it has become necessary to employ text mining and bibliometric analysis as tools for gaining deeper insights into this area of study. By using these methods, researchers can effectively identify and analyze research topics within the field. METHODS: This study retrieved relevant publications from the Web of Science Core Collection and conducted bioinformatics analysis. The latent Dirichlet allocation model, which is based on machine learning, was utilized to identify subfield research topics. RESULTS: A total of 4315 articles related to IIRs/IITs were obtained from the Web of Science Core Collection. After excluding duplicates and articles with missing abstracts, a final dataset of 3333 articles was included for bibliometric analysis. The number of publications showed a steady increase over time, particularly since 2000. The United States, Germany, the United Kingdom, the Netherlands, Canada, Denmark, Japan, Switzerland, and France emerged as the most productive countries in terms of IIRs/IITs. The citation analysis revealed intriguing trends, with certain highly cited articles showing a significant increase in citation frequency in recent years. A model with 45 topics was deemed the best fit for characterizing the extensively researched fields within IIRs/IITs. Our analysis revealed 10 top topics that have garnered significant attention, spanning domains such as community health, cancer treatment, brain development and disease mechanisms, nursing research, and stem cell therapy. These top topics offer researchers valuable directions for further investigation and innovation. Additionally, we identified 12 hot topics, which represent the most cutting-edge and highly regarded research areas within the field. 
CONCLUSION: This study contributes to a comprehensive understanding of the current research landscape and provides valuable insights for researchers working in this domain.

Subjects

Bibliometrics, Computational Biology, Humans, Canada, Data Mining, France

16.

A data mining approach to analyze the role of biomacromolecules-based nanocomposites in sustainable packaging.

Paul, John; Jacob, Jeeja; Mahmud, Md; Vaka, Mahesh; Krishnan, Syam G; Arifutzzaman, A; Thesiya, Dignesh; Xiong, Teng; Kadirgama, K; Selvaraj, Jeyraj.

Int J Biol Macromol ; 265(Pt 2): 130850, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38492706

ABSTRACT

Recent decades have witnessed a surge in research interest in bio-nanocomposite-based packaging materials, yet systematic analysis of this domain is still lacking. Bio-based packaging materials offer a sustainable alternative to petroleum-based packaging materials. The current work employs bibliometric analysis to deliver a comprehensive outline of the role of bio-nanocomposites in packaging. India, Iran, and China were the top three nations in this domain by total publications. Islamic Azad University in Iran and Universiti Putra Malaysia in Malaysia are among the most active institutions in research and publications in this field. The extensive collaboration between nations and institutions highlights the significance of a holistic approach to bio-nanocomposites. The National Natural Science Foundation of China is the leading funding body in this field of research. Among authors, Jong-Whan Rhim received the most citations (2234) in this domain (13 publications). Among journals, Carbohydrate Polymers received the highest citation count (4629) from 36 articles, the first of which was published in 2011. "Bio nanocomposite" is the most frequently used keyword. Researchers and policymakers focusing on sustainable packaging solutions will gain crucial insights from the conclusions into the current state of research on packaging solutions using bio-nanocomposites.

Subjects

Bibliometrics, Nanocomposites, Humans, Publications, Product Packaging, Data Mining

17.

Investigating acupoint selection and combinations of acupuncture for primary idiopathic tinnitus using data mining.

Huang, Liangliang; Fan, Yushan; Lin, Rui; Zhao, Yiping; Mo, Yaru; Luo, Sen; Li, Zhan.

Medicine (Baltimore) ; 103(12): e37107, 2024 Mar 22.

Article in English | MEDLINE | ID: mdl-38518013

ABSTRACT

BACKGROUND: Acupuncture is widely used in the treatment of tinnitus worldwide because of its good efficacy and safety. However, the criteria for selecting acupoint prescriptions and combinations have not been summarized. Therefore, data mining was used herein to determine the treatment principles and the most effective acupoint selection for the treatment of idiopathic tinnitus. METHODS: The clinical research literature on acupuncture in the treatment of idiopathic tinnitus, from database inception to September 1, 2023, was retrieved and extracted from the China National Knowledge Infrastructure, China Medical Journal Full-text Database, PubMed, Embase, Cochrane Library, and Web of Science databases. Microsoft Excel 2016 was used to establish the acupoint prescription database, and frequency statistics of acupoints, meridians, and specific acupoints were calculated. IBM SPSS Statistics 25.0 software was used for cluster analysis of acupoints, and IBM SPSS Modeler 18.0 software was used for association rule analysis of acupoints. RESULTS: A total of 112 articles were included, involving 221 acupuncture prescriptions and 99 acupoints, with a total frequency of 1786 uses. The 5 most frequently used acupoints were Tinggong (SI19), Tinghui (GB2), Yifeng (TE17), Ermen (TE21), and Zhongzhu (TE3). The commonly used meridians were the Sanjiao meridian of hand-shaoyang, the Gallbladder meridian of foot-shaoyang, and the Small intestine meridian of hand-taiyang. The specific points were mostly Crossing points, Five-shu points, and Yuan-primary points. The core acupoint combination from the association rules was Ermen (TE21)-Tinggong (SI19)-Tinghui (GB2)-Yifeng (TE17), and 3 effective clustering groups were obtained by cluster analysis of high-frequency acupoints. CONCLUSION: In this study, the published literature on acupuncture treatment of idiopathic tinnitus was analyzed by data mining and the relationships between acupoints were explored, providing a better-informed basis for acupoint selection in the clinical acupuncture treatment of idiopathic tinnitus.
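The association rule analysis named in the abstract above rests on two standard quantities, support and confidence. The following is a minimal Python sketch of those quantities for pairwise rules only. It is not the authors' IBM SPSS Modeler workflow, and the function names (`support`, `confidence`, `pair_rules`), the thresholds, and the toy prescriptions are all hypothetical illustrations rather than data from the study.

```python
from itertools import combinations


def support(itemset, prescriptions):
    """Fraction of prescriptions that contain every acupoint in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for p in prescriptions if itemset <= p)
    return hits / len(prescriptions)


def confidence(antecedent, consequent, prescriptions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, prescriptions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, prescriptions) / base


def pair_rules(prescriptions, min_support, min_confidence):
    """List (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*prescriptions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, prescriptions)
            c = confidence({lhs}, {rhs}, prescriptions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Hypothetical toy prescriptions (NOT data from the study).
toy = [
    frozenset({"SI19", "GB2", "TE17"}),
    frozenset({"SI19", "GB2"}),
    frozenset({"SI19", "TE21"}),
    frozenset({"GB2", "TE17", "TE3"}),
]

if __name__ == "__main__":
    for lhs, rhs, s, c in pair_rules(toy, min_support=0.5, min_confidence=0.9):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

A full association rule miner (for example Apriori) would also consider larger itemsets such as the four-acupoint core combination reported above; the sketch stops at pairs to keep the definitions visible.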

Subjects

Acupuncture Therapy, Meridians, Tinnitus, Humans, Acupuncture Points, Tinnitus/therapy, Data Mining

18.

Application of Data Mining Technology in the Screening for Gallbladder Stones: A Cross-Sectional Retrospective Study of Chinese Adults.

Wang, Shuang; Bao, Chenhui; Pei, Dongmei.

Yonsei Med J ; 65(4): 210-216, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38515358

ABSTRACT

PURPOSE: The purpose of this study was to use data mining methods to establish a simple and reliable predictive model based on the risk factors related to gallbladder stones (GS) to assist in their diagnosis and reduce medical costs. MATERIALS AND METHODS: This was a retrospective cross-sectional study. A total of 4215 participants underwent annual health examinations between January 2019 and December 2019 at the Physical Examination Center of Shengjing Hospital Affiliated to China Medical University. After rigorous data screening, the records of 2105 examinees were included for the construction of J48, multilayer perceptron (MLP), Bayes Net, and Naïve Bayes models. Ten-fold cross-validation was used to verify the recognition models and determine the best classification algorithm for GS. RESULTS: The performance of these models was evaluated using accuracy, precision, recall, F-measure, and area under the receiver operating characteristic curve. Comparison of the F-measure for each algorithm revealed that the F-measure values for MLP and J48 (0.867 and 0.858, respectively) were not significantly different from each other (p>0.05), although they were significantly higher than the F-measure values for Bayes Net and Naïve Bayes (0.824 and 0.831, respectively; p<0.05). CONCLUSION: The results of this study showed that the MLP and J48 algorithms are effective at screening individuals for the risk of GS. The key attributes identified by data mining can further promote the prevention of GS through targeted community intervention, improve the outcome of GS, and reduce the burden on the medical system.
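The evaluation described in the abstract above combines ten-fold cross-validation with the F-measure. The sketch below shows, under stated assumptions, how the F-measure follows from confusion counts and how 2105 records could be partitioned into ten folds. The function names `f_measure` and `k_fold_indices` are hypothetical, the counts in the usage example are invented for illustration, and the abstract does not say which software or splitting procedure the authors used.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold sizes differ by at most one.

    Assumption: records are taken in their stored order. A real analysis
    would usually shuffle (and often stratify by class) before splitting.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(round(f_measure(tp=8, fp=2, fn=2), 3))
    # 2105 is the number of records reported in the abstract.
    folds = list(k_fold_indices(2105, k=10))
    print([len(test) for _, test in folds])
```

Each record appears in exactly one test fold, so every model is scored on data it was not trained on in that round; the per-fold scores are then averaged.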

Subjects

Algorithms, Gallbladder, Adult, Humans, Retrospective Studies, Cross-Sectional Studies, Bayes Theorem, Data Mining/methods

19.

QTL Mapping and Data Mining to Identify Genes Associated with Soybean Epicotyl Length Using Cultivated Soybean and Wild Soybean.

Chen, Lin; Ma, Shengnan; Li, Fuxin; Li, Lanxin; Yu, Wenjun; Yu, Lin; Tang, Chunshuang; Liu, Chunyan; Xin, Dawei; Chen, Qingshan; Wang, Jinhui.

Int J Mol Sci ; 25(6), 2024 Mar 14.

Article in English | MEDLINE | ID: mdl-38542270

ABSTRACT

Soybean (Glycine max) plants first emerged in China, and they have since been established as an economically important oil crop and a major source of daily protein for individuals throughout the world. Seed emergence height is the first factor that ensures seedling adaptability to field management practices, and it is closely related to epicotyl length. In the present study, the Suinong 14 and ZYD00006 soybean lines were used as parents to construct chromosome segment substitution lines (CSSLs) for quantitative trait loci (QTL) identification. Seven QTLs were identified using two years of epicotyl length measurement data. The insertion region of the ZYD00006 fragment was identified through whole genome resequencing, with candidate gene screening and validation being performed through RNA-Seq and qPCR, and Glyma.08G142400 was ultimately selected as an epicotyl length-related gene. Through combined analyses of phenotypic data from the study population, Glyma.08G142400 expression was found to be elevated in those varieties exhibiting longer epicotyl length. Haplotype data analyses revealed that epicotyl data were consistent with haplotype typing. In summary, the QTLs found to be associated with the epicotyl length identified herein provide a valuable foundation for future molecular marker-assisted breeding efforts aimed at improving soybean emergence height in the field, with the Glyma.08G142400 gene serving as a regulator of epicotyl length, offering new insight into the mechanisms that govern epicotyl development.

Subjects

Soybeans, Quantitative Trait Loci, Humans, Soybeans/genetics, Chromosome Mapping, Plant Breeding, Seeds/metabolism, Data Mining

20.

COVID-19 outbreaks surveillance through text mining applied to electronic health records.

Rocha, Hermano Alexandre Lima; Solha, Erik Zarko Macêdo; Furtado, Vasco; Justino, Francion Linhares; Barreto, Lucas Arêa Leão; da Silva, Ronaldo Guedes; de Oliveira, Ítalo Martins; Bates, David Westfall; de Góes Cavalcanti, Luciano Pamplona; Lima Neto, Antônio Silva; de Oliveira, Erneson Alves.

BMC Infect Dis ; 24(1): 359, 2024 Mar 28.

Article in English | MEDLINE | ID: mdl-38549109

ABSTRACT

BACKGROUND: The COVID-19 pandemic has caused significant disruptions to everyday life and has had social, political, and financial consequences that will persist for years. Several initiatives with intensive use of technology were quickly developed in this scenario. However, technologies that enhance epidemiological surveillance in contexts with low testing capacity and healthcare resources are scarce. Therefore, this study aims to address this gap by developing a data science model that uses routinely generated healthcare encounter records to detect possible new outbreaks early and in real time. METHODS: We defined an epidemiological indicator that is a proxy for suspected cases of COVID-19 using the health records of Emergency Care Unit (ECU) patients and text mining techniques. The open-field dataset comprises 2,760,862 medical records from nine ECUs, where each record has information about the patient's age, reported symptoms, and the time and date of admission. We also used a dataset of 1,026,804 officially confirmed cases of COVID-19. The records range from January 2020 to May 2022. Sample cross-correlation between two finite stochastic time series was used to evaluate the models. RESULTS: For patients with age 18 years, we find a time lag of 72 days with a cross-correlation of approximately 0.82, a time lag of 25 days with a cross-correlation of approximately 0.93, and a time lag of 17 days with a cross-correlation of approximately 0.88 for the first, second, and third waves, respectively. CONCLUSIONS: The developed model can aid in the early detection of signs of possible new COVID-19 outbreaks weeks before traditional surveillance systems, allowing preventive and control actions in public health to be initiated earlier and with a higher likelihood of success.
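The abstract above evaluates its indicator by the sample cross-correlation between two time series at a given time lag. The sketch below shows one common textbook form of that estimator and a search for the lag that maximizes it. It is only an illustration: the function names `cross_correlation` and `best_lag`, the sign convention (a positive lag means the second series follows the first), and the toy series are assumptions, and the abstract does not specify the exact estimator the authors used.

```python
from math import sqrt


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag], for lag >= 0.

    Uses the full-series means and variances, as in the usual sample
    cross-correlation function.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have the same length")
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    var_x = sum((v - mean_x) ** 2 for v in x)
    var_y = sum((v - mean_y) ** 2 for v in y)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which the cross-correlation is largest."""
    return max(range(max_lag + 1), key=lambda lag: cross_correlation(x, y, lag))


# Hypothetical toy series (NOT study data): `confirmed` repeats the shape
# of `indicator` two time steps later.
indicator = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
confirmed = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]

if __name__ == "__main__":
    lag = best_lag(indicator, confirmed, max_lag=4)
    print(lag, round(cross_correlation(indicator, confirmed, lag), 3))
```

In this reading, a large correlation at a positive lag suggests that the first series moves ahead of the second by that many time steps, which is the sense in which an indicator could anticipate confirmed cases.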

Subjects

COVID-19, Humans, Adolescent, COVID-19/epidemiology, Electronic Health Records, Pandemics, Disease Outbreaks, Data Mining
